
72 research outputs found

The Resource Usage Aware Backfilling

Author: B.G. Lawson
D.G. Feitelson
E. Shmueli
J. Skovira
M. Calzarossa
S.-H. Chiang
S.J. Chapin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Abstract. Job scheduling policies for HPC centers have been extensively studied in the last few years, especially backfilling-based policies. Almost all of these studies have been done using simulation tools. All the existing simulators use the runtime (either estimated or real) provided in the workload as the basis of their simulations. In our previous work we analyzed the impact on system performance of considering the resource sharing (memory bandwidth) of running jobs, including a new resource model in the Alvio simulator. Based on these studies we proposed the LessConsume and LessConsume Threshold resource selection policies. Both are oriented to reduce the saturation of the shared resources, thus increasing the performance of the system. The results showed that both resource allocation policies can improve the performance of the system by considering where the jobs are finally allocated. Using the LessConsume Threshold Resource Selection Policy, we propose a new backfilling strategy: the Resource Usage Aware Backfilling job scheduling policy. This is a backfilling-based scheduling policy in which the algorithms that decide which job has to be executed and how jobs have to be backfilled are based on different Threshold configurations. This backfilling variant considers how the shared resources are used by the scheduled jobs. Rather than backfilling the first job that can be moved to the run queue based on the job arrival time or job size, it looks ahead to the next queued jobs and tries to allocate jobs that would experience a lower penalized runtime caused by resource sharing saturation. In the paper we demonstrate how the exchange of scheduling information between the local resource manager and the scheduler can substantially improve the performance of the system when resource sharing is considered. We show how it can achieve response time performance close to that of Shortest Job First Backfilling with First Fit (oriented to improve the start time for the allocated jobs), while providing a qualitative improvement in the number of killed jobs and in the percentage of penalized runtime.
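
The look-ahead idea described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm or the Alvio simulator's code: the `Job` fields, the linear penalty model in `penalty_fraction`, the `lookahead` window, and the way the threshold is applied are all assumptions made for illustration. The sketch also omits the usual backfilling check that a candidate must not delay the reservation of the job at the head of the queue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    cpus: int          # processors requested
    runtime: float     # user-estimated runtime in seconds
    bandwidth: float   # hypothetical memory-bandwidth demand (arbitrary units)


def penalty_fraction(job: Job, used_bw: float, capacity: float) -> float:
    """Assumed penalty model: oversubscribed bandwidth as a fraction of capacity."""
    demand = used_bw + job.bandwidth
    return max(0.0, demand - capacity) / capacity


def pick_backfill(queue: List[Job], free_cpus: int, used_bw: float,
                  capacity: float, threshold: float,
                  lookahead: int = 5) -> Optional[Job]:
    """Among the first `lookahead` queued jobs that fit in the free processors,
    return the one with the lowest estimated penalty, provided that penalty
    does not exceed `threshold`. Ties keep the earlier job in the queue."""
    best: Optional[Job] = None
    best_penalty: Optional[float] = None
    for job in queue[:lookahead]:
        if job.cpus > free_cpus:
            continue  # does not fit in the currently idle processors
        p = penalty_fraction(job, used_bw, capacity)
        if p > threshold:
            continue  # would saturate the shared resource too much
        if best_penalty is None or p < best_penalty:
            best, best_penalty = job, p
    return best


queue = [
    Job("a", cpus=4, runtime=100.0, bandwidth=8.0),
    Job("b", cpus=2, runtime=50.0, bandwidth=1.0),
    Job("c", cpus=16, runtime=10.0, bandwidth=0.0),
]
chosen = pick_backfill(queue, free_cpus=4, used_bw=6.0,
                       capacity=10.0, threshold=0.2)
print(chosen.name if chosen else None)
```

In this example job "a" fits but would push bandwidth demand past capacity, so the sketch skips it in favor of job "b", which a plain first-fit backfill would not have reached first.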

CiteSeerX

Crossref

Energy-Aware Lease Scheduling in Virtualized Data Centers

Author: A. Beloglazov
D.G. Feitelson
L.A. Barroso
R. Buyya
R. Panigrahy
S. Albers
X. Fan
Publication date: 28/10/2014

Energy efficiency has become an important measure of scheduling algorithms in virtualized data centers. One of the challenges of energy-efficient scheduling algorithms, however, is the trade-off between minimizing energy consumption and satisfying quality of service (e.g. performance, resource availability on time for reservation requests). We consider resource needs in the context of virtualized data centers of a private cloud system, which provides resource leases in terms of virtual machines (VMs) for user applications. In this paper, we propose heuristics for scheduling VMs that address the above challenge. In the performance evaluation, simulation results show a significant reduction in total energy consumption for our proposed algorithms compared with an existing First-Come-First-Serve (FCFS) scheduling algorithm, with the same fulfillment of performance requirements. We also discuss the improvement in energy saving when migration policies are additionally applied to the above-mentioned algorithms. Comment: 10 pages, 2 figures, Proceedings of the Fifth International Conference on High Performance Scientific Computing, March 5-9, 2012, Hanoi, Vietnam

arXiv.org e-Print Archive

Crossref


Improved utilization and responsiveness with gang scheduling

Author: Feitelson D.G.
Jette M.A.
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 01/10/1996

Most commercial multicomputers use space-slicing schemes in which each scheduling decision has an unknown impact on the future: should a job be scheduled, risking that it will block other larger jobs later, or should the processors be left idle for now in anticipation of future arrivals? This dilemma is solved by using gang scheduling, because then the impact of each decision is limited to its time slice, and future arrivals can be accommodated in other time slices. This added flexibility is shown to improve overall system utilization and responsiveness. Empirical evidence from using gang scheduling on a Cray T3D installed at Lawrence Livermore National Lab corroborates these results, and shows conclusively that gang scheduling can be very effective with current technology. 29 refs., 10 figs., 6 tabs

UNT Digital Library

Coscheduling under Memory Constraints in a NOW Environment

Author: D. Burger
D.G. Feitelson
F. Solsona
K.Y. Wang
W. Leinberger
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Multiple-queue backfilling scheduling with priorities and reservations for parallel systems

Author: Barry G. Lawson
Bode B.
Evgenia Smirni
Feitelson D.G.
Lawson B. G.
Perkovic D.
Talby D.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Analyzing the EGEE production grid workload: application to jobs submission optimization

Author: A. Iosup
C. Germain
D. Lingrand
D.G. Feitelson
E. Frachtenberg
E. Medernach
H. Li
K. Christodoulopoulos
T. Glatard
Publication venue: Springer
Publication date: 01/01/2009

Grid reliability remains an order of magnitude below that of clusters on production infrastructures. This work aims at improving grid application performance by improving the job submission system. A stochastic model capturing the behavior of a complex grid workload management system is proposed. To instantiate the model, detailed statistics are extracted from dense grid activity traces. The model is exploited in a simple job resubmission strategy. It provides quantitative inputs to improve job submission performance, and it enables quantifying the impact of faults and outliers on grid operations.

Crossref

HAL-UNICE

Web-Scale Job Scheduling

Author: A. Nissimov
A.C. Sodan
A.K. Mishra
B. Atikoglu
B. Schroeder
D. Borthakur
D. Jackson
D. Meisner
D.G. Andersen
D.G. Feitelson
E. Frachtenberg
G. Holt
G. Sabin
H. Raj
J. Dean
J. Leverich
K. Xiong
L.A. Barroso
T. Jones
T. White
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

An Extended Evaluation of Two-Phase Scheduling Methods for Animation Rendering

Author: D.A. Lifka
D.G. Feitelson
E. Anderson
E. Heymann
H. Kellerer
J. Krallmann
J. Sgall
M. Pinedo
P. Brucker
R. Graham
R.P. Brent
T.L. Adam
Y.-K. Kwok
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Crossref

The Influence of the Structure and Sizes of Jobs on the Performance of Co-Allocation

Author: D.G. Feitelson
J. Patton Jones
K. Aida
T.B. Brecht
Publication venue: Springer-Verlag
Publication date: 01/01/2000

Over the last decade, much research in the area of scheduling has concentrated on single-cluster systems. Less attention has been paid to multicluster systems, although they are gaining more and more importance in practice. We propose a model for scheduling rigid jobs consisting of multiple components in multicluster systems by pure space sharing, based on the Distributed ASCI Supercomputer. Using simulations, we assess the influence of the structure and sizes of the jobs on the system’s performance, measured in terms of the average response time and the maximum utilization. We consider three types of requests (total requests, unordered requests and ordered requests) and compare their effect on the system’s performance for two scheduling policies: First Come First Served, and Fit Processors First Served, which allows the scheduler to look further in the queue for jobs that fit. These types of job requests are differentiated by the restrictions they impose on the scheduler and by the form of co-allocation used. The results show that the performance improves with decreasing average job size and when fewer restrictions are imposed on the scheduler.
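
The difference between the two policies named in this abstract can be shown with a small Python sketch. It is a simplification under stated assumptions, not the paper's model: the queue is a list of processor counts, the system is treated as a single pool of idle processors (the paper's multicluster components and request types are not modeled), and the function names `fcfs_next` and `fpfs_next` are hypothetical.

```python
from typing import List, Optional


def fcfs_next(queue: List[int], idle: int) -> Optional[int]:
    """First Come First Served: only the job at the head of the queue may start.
    Returns its index (0) if it fits in the idle processors, else None."""
    if queue and queue[0] <= idle:
        return 0
    return None


def fpfs_next(queue: List[int], idle: int) -> Optional[int]:
    """Fit Processors First Served: scan the queue in arrival order and
    return the index of the first job that fits, else None."""
    for index, size in enumerate(queue):
        if size <= idle:
            return index
    return None


queue = [8, 2, 4]   # requested processors, in arrival order
idle = 4            # processors currently idle (assumed single pool)
print(fcfs_next(queue, idle))  # None: the 8-processor head job blocks the queue
print(fpfs_next(queue, idle))  # 1: the 2-processor job is the first that fits
```

The example shows the case the abstract alludes to: under FCFS a large job at the head leaves processors idle, while FPFS looks further into the queue.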

Scaling of workload traces

Author: A.B. Downey
A.W. Mu’alem
B.G. Lawson
C. Ernemann
D.G. Feitelson
J. Jann
J. Krallmann
S. Hotovy
U. Schwiegelshohn
V. Lo
Publication venue: Springer-Verlag
Publication date: 01/01/2003

Abstract. The design and evaluation of job scheduling strategies often require simulations with workload data or models. Usually workload traces are the most realistic data source, as they include all explicit and implicit job patterns, which are not always considered in a model. In this paper, a method is presented to enlarge and/or duplicate jobs in a given workload. This allows the scaling of workloads for later use on parallel machine configurations with a different number of processors. As quality criteria, the scheduling results of common algorithms have been examined. The results show high sensitivity of schedule attributes to modifications of the workload. To this end, different strategies of scaling the number of job copies and/or the job size have been examined. The best results were achieved by adjusting the scaling factors to be higher than the precise ratio between the new scaled machine size and the original source configuration.
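
The two scaling operations mentioned in this abstract, enlarging jobs and duplicating them, can be sketched in Python as follows. This is an illustrative sketch only: the `TraceJob` record, the `scale_trace` function, and its parameters are hypothetical, and the paper's actual strategies and factor choices are not reproduced here.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class TraceJob:
    submit: float    # submission time in seconds
    cpus: int        # processors requested
    runtime: float   # runtime in seconds


def scale_trace(trace: List[TraceJob], size_factor: int = 1, copies: int = 1,
                max_cpus: Optional[int] = None) -> List[TraceJob]:
    """Enlarge each job by `size_factor` and emit `copies` copies of it.
    If `max_cpus` is given, job sizes are capped at the target machine size."""
    scaled: List[TraceJob] = []
    for job in trace:
        cpus = job.cpus * size_factor
        if max_cpus is not None:
            cpus = min(cpus, max_cpus)
        # Copies keep the original submit time and runtime.
        scaled.extend(replace(job, cpus=cpus) for _ in range(copies))
    return scaled


trace = [TraceJob(0.0, 4, 100.0), TraceJob(10.0, 64, 50.0)]
scaled = scale_trace(trace, size_factor=2, copies=2, max_cpus=100)
print([job.cpus for job in scaled])
```

Enlarging preserves the number of jobs but changes their sizes, while duplicating preserves sizes but raises the arrival rate; combining both factors, as in the call above, is one way to explore the trade-off the abstract describes.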

CiteSeerX

Crossref